ggplot2: Basics

ggplot2: The Package


ggplot2 is part of the tidyverse

# install.packages("tidyverse")
library(tidyverse)


… but it can of course also be loaded separately:

# install.packages("ggplot2")
library(ggplot2)

The Big Picture

Start: ggplot()

ggplot()

Components

  1. Data.
  2. Aesthetic mapping between the data and visual properties.
  3. [Layer(s)] to render the data.
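
As a rough sketch of how the three components fit together, the following uses `mpg`, a dataset bundled with ggplot2, as a stand-in. This is an assumption for illustration only; it is not the data used in these slides, and the object name `p` is hypothetical.

```r
library(ggplot2)

# Assumption: `mpg` (bundled with ggplot2) stands in for the course data.
p <- ggplot(
  data = mpg,                          # 1. data
  mapping = aes(x = displ, y = hwy)    # 2. aesthetic mapping
) +
  geom_point()                         # 3. a layer that renders the data

# The plot object keeps the three components separately.
names(p$mapping)   # aesthetics that were mapped
length(p$layers)   # number of layers added so far

p                  # printing the object draws the plot
```

Inspecting `p$mapping` and `p$layers` is only meant to make the structure visible; in everyday use you simply print the plot.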

Introducing the data

library(jsonlite)
library(tidyverse)

# Download and merge all Gapminder CSV files whose paths match `keywords`.
get_gapminder <- function(repo = "https://github.com/open-numbers/ddf--gapminder--fasttrack",
                          keywords = "co2") {
  # Build the raw-content base URL from `repo` (master branch)
  raw_url <- gsub("github.com", "raw.githubusercontent.com", repo, fixed = TRUE)
  base_url <- paste0(raw_url, "/refs/heads/master/")

  # datapackage.json lists every CSV resource in the repository
  json_data <- jsonlite::fromJSON(paste0(base_url, "datapackage.json"))

  # Keep only the paths that match at least one keyword
  csv_paths <- json_data$resources$path
  matched_paths <- csv_paths[str_detect(csv_paths, str_c(keywords, collapse = "|"))]

  if (length(matched_paths) == 0) {
    stop("No files matched the specified keywords.")
  }

  matched_urls <- paste0(base_url, matched_paths)

  # Read the first file, then full-join the remaining files on their shared columns
  merged_df <- read_csv(matched_urls[1], show_col_types = FALSE)
  for (url in matched_urls[-1]) {
    message("Reading file: ", url)
    merged_df <- full_join(merged_df, read_csv(url, show_col_types = FALSE))
  }

  merged_df
}

co2_gapminder <- get_gapminder(keywords = c("pop--", "co2"))

Data

ggplot(data = movies_metadat)

Aesthetic mapping

To fill this empty canvas, we need to link the data to the visual properties we want to use:

mapping = aes()

Which visual properties are available depends on the type of plot. For now, the most important one is position, i.e. the x- and y-axes.
Other properties can be mapped here as well, for example the color of the points depending on categories in the data.
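
To illustrate a mapping that goes beyond position, here is a minimal sketch that colors points by a categorical column. It assumes the `mpg` dataset bundled with ggplot2 as a stand-in, because the columns of the course data are not shown at this point; `p_color` is a hypothetical name. `geom_point()` draws the points.

```r
library(ggplot2)

# Assumption: `mpg` (bundled with ggplot2) stands in for the course data.
p_color <- ggplot(
  data = mpg,
  mapping = aes(
    x = displ,      # position on the x-axis: engine displacement
    y = hwy,        # position on the y-axis: highway miles per gallon
    color = class   # point color depends on the vehicle class (categorical)
  )
) +
  geom_point()

p_color
```

Because `color` sits inside `aes()`, ggplot2 picks one color per category and adds a legend automatically.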

Aesthetic mapping: Axes

ggplot(
  data = movies_metadat,
  mapping = aes(
    x = budget,
    y = vote_average
  )
)

Geometric Layers

A ggplot is built from several layers, which are stacked on top of each other with +.

geom_

Layers

ggplot(
  data = movies_metadat,
  mapping = aes(
    x = budget,
    y = vote_average
  )
) +
  geom_point()

More layers!

ggplot(
  data = movies_metadat,
  mapping = aes(
    x = budget,
    y = vote_average
  )
) +
  geom_point() +
  geom_smooth()

Titles/Labels

ggplot(
  data = movies_metadat,
  mapping = aes(
    x = budget,
    y = vote_average
  )
) +
  geom_point() +
  geom_smooth() +
  labs(
    title = "Getting a bang for your buck: Are movies with a higher budget also better?",
    subtitle = "There doesn't seem to be a strong relation between movie budget and average rating.",
    x = "Movie budget",
    y = "Average vote"
  )

Style your plot: Themes

ggplot(
  data = movies_metadat,
  mapping = aes(
    x = budget,
    y = vote_average
  )
) +
  geom_point() +
  geom_smooth() +
  labs(
    title = "Getting a bang for your buck: Are movies with a higher budget also better?",
    subtitle = "There doesn't seem to be a strong relation between movie budget and average rating.",
    x = "Movie budget",
    y = "Average vote"
  ) +
  theme_classic()

Exercise

Let’s take a deeper dive

Go through everything again in more detail here: what did we actually do? Don't get lost in the basics; move on to the deeper topics sooner (scales, coord system …)
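
As one possible starting point for the scales and coordinate-system topics named above, here is a hedged sketch. It again assumes the bundled `mpg` dataset as a stand-in; the name `p_scaled` and the axis limits are chosen only for illustration.

```r
library(ggplot2)

# Assumption: `mpg` (bundled with ggplot2) stands in for the course data.
p_scaled <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point() +
  scale_x_log10() +                   # scale: draw the x-axis on a log10 scale
  coord_cartesian(ylim = c(10, 40))   # coord system: zoom in without dropping data

p_scaled
```

The scale changes how data values are translated to positions, while `coord_cartesian()` only changes the visible window of the plot.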

Saving
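
A minimal sketch of saving a plot with `ggsave()`. The plot, the file name `mpg_scatter.png`, the temporary output directory, and the size settings are all assumptions for illustration; the bundled `mpg` dataset stands in for the course data.

```r
library(ggplot2)

# Assumption: `mpg` (bundled with ggplot2) stands in for the course data.
p_save <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point()

# Hypothetical output path; replace it with a path inside your project.
out_file <- file.path(tempdir(), "mpg_scatter.png")

ggsave(
  filename = out_file,   # the file extension determines the output format
  plot = p_save,
  width = 16,
  height = 9,
  units = "cm",
  dpi = 300
)
```

Passing `plot` explicitly avoids relying on the last plot that happened to be displayed.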

Colors

https://questionsindataviz.com/2023/12/29/what-makes-a-truly-terrible-map/